Traffic fatalities are a major cause of death in the United States and a leading cause of death in the first three decades of life [1]. Alcohol is involved in almost one third of traffic fatalities nationwide. In 2016, 10,497 people died in alcohol-impaired driving crashes, accounting for 28% of traffic fatalities in the US [2]. A variety of individual skills have been shown to be impaired at blood alcohol concentration (BAC) levels well below 0.05%, and the risk of a crash increases exponentially as BAC rises [3].
From the 1980s to the 2010s, deaths resulting from motor vehicle collisions declined by nearly 35% [4]. The decline coincided with a period (1980-1985) when many states enacted a considerable number of legislative reforms to reduce drunken driving and fatal automobile crashes: raising the minimum drinking age to 21, adopting criminal and administrative per se laws, and increasing penalties for drunken driving [5]. Many previous studies have shown that alcohol regulations and alcohol-impaired driving laws can reduce fatal car crashes, including BAC limits for drivers of 0.08 or lower [6]; minimum drinking age (21) laws and zero-tolerance laws for younger drivers [7]; and fines and jail sentences for alcohol-impaired driving [8].
Despite this evidence that alcohol regulations and drunken driving penalties reduce alcohol-related traffic fatalities, the traffic policy environment varies considerably across states, and many socioeconomic factors that can affect alcohol-involved traffic fatalities are not consistent across states. Therefore, the objective of the current project is to investigate the effect of alcohol regulation and alcohol-impaired driving laws on the proportion of alcohol-involved fatalities among all traffic fatalities, controlling for state-specific unobservable confounders.
In 2016, 10,497 people were killed in alcohol-impaired driving crashes, an average of one alcohol-impaired driving fatality every 50 minutes. Alcohol was involved in 28% of all vehicle fatalities. Alcohol-related traffic incidents are, however, nothing new.
The data for this project come from a 1996 study in the Journal of Health Economics that examined vehicle fatalities from 1982 to 1988. The data were obtained from the National Highway Traffic Safety Administration's Fatal Accident Reporting System (FARS). It was a population-based study that observed traffic fatalities in 48 states (excluding Alaska, Hawaii, and the District of Columbia) over 7 years. There were 336 observations on 34 variables; however, California had missing values in 1988, so that observation was omitted, leaving 335 observations. This period is important because these were the last years in which states had differing minimum drinking ages. After 1988, the drinking age was 21 throughout the United States.
The predictor variables examined in the study were state, year, spirits consumption, unemployment rate, per capita personal income, employment/population ratio, tax on a case of beer, percent of people who were Southern Baptist, percent of people who were Mormon, the minimum legal drinking age, percent residing in "dry" counties, percent of drivers aged 15-24, average miles per driver, whether the state had a mandatory jail sentence, whether the state had mandatory community service, the population of each state, the population of 15 to 17 year olds in each state, the population of 18 to 20 year olds in each state, the population of 21 to 24 year olds in each state, the total vehicle miles (millions), the US unemployment rate, the US employment/population ratio, and the GSP rate of change.
The response variables in the study were the number of fatalities, the number of nighttime fatalities, the number of single-vehicle fatalities, fatalities among 15 to 17 year olds, fatalities among 18 to 20 year olds, fatalities among 21 to 24 year olds, and alcohol-related fatalities. The study also examined nighttime fatalities in the three age groups (15-17, 18-20, 21-24).
The objective of this project is to build a model from the data of the previous study that predicts the percentage of vehicle fatalities that are alcohol related. The predictor variables under consideration are those related to alcohol: mandatory jail sentence (jail), mandatory community service (service), tax on beer (beertax), minimum drinking age (drinkage), percent residing in dry counties (dry), and preliminary breath test law (breath). Further research showed that unemployment is also positively correlated with alcohol misuse, so unemployment was added as a variable as well. The response variables considered are the alcohol-related fatality rate (alcohol-related fatalities per 10,000 people), the fatality rate (vehicle fatalities per 10,000 people), and the proportion of alcohol-related fatalities (alcohol-related fatalities over total fatalities).
The variables used from the data set are afatal (number of alcohol-involved vehicle fatalities), pop (population), state (factor indicating state), year (factor indicating year), spirits (spirits consumption), beertax (tax on case of beer), drinkage (minimum legal drinking age), dry (percent residing in "dry" counties), unemp (unemployment rate), jail (mandatory jail sentence), service (mandatory community service), and breath (preliminary breath test law). The unit of analysis is a state in a given year, so the variables state and year were retained from the data set.
According to a 2017 study by WalletHub, there were significant differences in the strictness of DUI and DWI laws across the 50 states and the District of Columbia. States with the strictest laws were Arizona, Georgia, Alaska, Kansas, and Oklahoma.
States with the most lenient laws were Idaho, North Dakota, Ohio, District of Columbia, and South Dakota. This suggests that state could be a factor in the proportion of alcohol-related fatalities. The data were accessed through the AER package in R.
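As a minimal sketch of the data access described above, and assuming the AER package is installed, the data set can be loaded and the incomplete state-year removed as follows. The object names `fat` and `fat_complete` are choices made here for illustration; `afatal_prop` is the response name that appears in the model output later in this report.

```r
# Sketch only: assumes the AER package (which ships the Fatalities data) is installed.
if (requireNamespace("AER", quietly = TRUE)) {
  data("Fatalities", package = "AER")   # state-year panel, 1982-1988

  fat <- Fatalities
  # Proportion of alcohol-involved fatalities among all vehicle fatalities
  fat$afatal_prop <- fat$afatal / fat$fatal

  # Keep only state-years with jail/service information recorded
  has_laws     <- complete.cases(fat[, c("jail", "service")])
  fat_complete <- fat[has_laws, ]

  print(dim(fat))           # rows and columns before dropping
  print(dim(fat_complete))  # rows and columns after dropping
  # Show which state-year was removed
  print(fat[!has_laws, c("state", "year", "jail", "service")])
}
```

If the package is not available, the block does nothing rather than failing, so it can be dropped into a script without side effects.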
In line with the goal of this analysis, various metrics can be used to understand the severity of alcohol-related vehicle fatalities throughout the US, including the number of alcohol-related fatalities per state, the rate of alcohol-related fatalities per 10k people, and the ratio of alcohol-related fatalities to overall fatalities. Each metric produces different results and invites a different interpretation.
To get a sense of how the choice of metric affects the interpretation of the outcome, consider the map below, which shades each state in purple according to its mean proportion of alcohol-related fatalities. Hovering over the map shows that California (CA) has one of the largest numbers of alcohol-related fatalities in the country (averaging more than 5000 deaths per year), yet its rate of 1.9 alcohol-related fatalities per 10k people is about half that of Texas (TX), at 3.6 per 10k people. On the other hand, when we compare the ratio of alcohol-related fatalities to overall fatalities, both California (0.26) and Texas (0.41) fall well below Mississippi (0.52).
These differences show the importance of choosing a suitable metric for the purpose of this data analysis. The choice of metric will be discussed in Section 4.2.1.2.
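To illustrate how the three metrics can order states differently, here is a small R sketch. The three states and all of their figures are invented for illustration and are not taken from the Fatalities data.

```r
# Hypothetical figures, invented purely to illustrate the three metrics
toy <- data.frame(
  state  = c("A", "B", "C"),
  pop    = c(25e6, 8e6, 2.5e6),   # residents
  fatal  = c(5000, 3600, 700),    # all vehicle fatalities
  afatal = c(1300, 1400, 350),    # alcohol-involved fatalities
  stringsAsFactors = FALSE
)

toy$afatal_rate <- toy$afatal / toy$pop * 1e4  # per 10,000 residents
toy$afatal_prop <- toy$afatal / toy$fatal      # share of all fatalities

# Order the states from highest to lowest under a given metric
rank_by <- function(x) toy$state[order(x, decreasing = TRUE)]

rank_count <- rank_by(toy$afatal)
rank_rate  <- rank_by(toy$afatal_rate)
rank_prop  <- rank_by(toy$afatal_prop)

print(toy)
print(list(count = rank_count, rate = rank_rate, prop = rank_prop))
```

In this toy example the state with the most alcohol-related deaths is not the one with the highest proportion, which is the same kind of reversal described for California, Texas, and Mississippi.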
As the focus of this data analysis is to find out whether the laws implemented to tackle drunk-driving-related fatalities were effective, only a subset of the variables from the Fatalities dataset was used. In particular, the response variables examined were the total number of fatalities and the number of alcohol-related fatalities, while the predictor variables analyzed were those closely related to alcohol-consumption-driven laws.
We start off the exploratory data analysis procedure by individually examining the predictor and response variables. The goal here is to understand how the data is distributed, which helps set an expectation on how the variables correlate with each other, or whether certain model assumptions will be met.
As we look into how alcohol-consumption-driven laws affect the rate of alcohol-related fatalities, the variables of interest include spirits consumption, beer tax, the proportion of the population living in dry counties, the minimum drinking age, and the mandatory punishments implemented by each state over the 7 years.
The plot below shows the top 5 states in terms of average spirits consumption, average beer tax, and average proportion of the population living in dry counties between 1982 and 1988. Other than North Carolina (NC), which is in the top 5 both for beer tax and for the proportion of residents in dry counties, there is no "standout" state, i.e., no other state appears in more than one of the top 5 categories.
On the other hand, laws were increasingly implemented or tightened over the 7 years. The most obvious change is the number of states that raised the minimum drinking age. In 1982, almost half the country had a minimum drinking age below 21, yet 7 years later most states had adopted 21 as the minimum drinking age.
Additionally, there seems to be a slight increasing trend in the number of states that implemented testing (preliminary breath tests) and punishments (mandatory jail sentences and mandatory community service) between 1982 and 1988. We need to note, however, that the number of states implementing mandatory jail sentences decreased very slightly from 1986 to 1988. This raises the question of whether a mandatory jail sentence is effective in combating drunk driving. Such questions will be addressed after fitting a suitable model.
After observing the trend in the implemented laws, we turn to the distribution of fatalities and alcohol-related fatalities across the country. A quick look at the top two histograms below might suggest that a large portion of states have fewer than 1000 fatalities per year and fewer than 500 alcohol-related fatalities per year. However, each state's population needs to be taken into account because population sizes vary considerably across the country. The population-adjusted histograms (bottom two) show distributions that can be approximated as normal.
Since the goal of this analysis is to discuss the effects of alcohol-related laws, alcohol-related fatalities are our main topic of interest. There are a number of candidate metrics for such fatalities, among them the number of alcohol-related fatalities per 10k people. The issue with this metric is that it does not show clearly whether the implemented traffic policies succeeded in reducing alcohol-related fatalities. Other factors could have lowered the overall fatality rate, which in turn would lower the alcohol-related fatality rate.
In this analysis, the metric used for analyzing the effect of traffic laws on alcohol-related fatalities is the proportion of alcohol-related fatalities among overall fatalities by state and year, i.e., \[
p = \frac{\text{Number of alcohol-related fatalities}}{\text{Number of overall fatalities}}
\] An advantage of this metric for the purpose of this data analysis is its robustness to the changes in overall fatalities. In other words, \(p\) is still able to give us useful and unbiased information on alcohol-related fatalities in response to the changes in overall fatalities in certain states or years.
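To make the robustness claim concrete, consider a hypothetical scenario (an assumption for illustration, not a result from the data) in which some factor unrelated to alcohol policy scales both the alcohol-related fatalities \(A\) and the overall fatalities \(F\) of a state by a common factor \(c > 0\), while the population \(N\) stays fixed. The symbols \(A\), \(F\), \(N\), \(c\), and the per-10k rate \(r = 10^4 A / N\) are introduced here only for this example. \[
p' = \frac{cA}{cF} = \frac{A}{F} = p, \qquad r' = \frac{10^4 \, cA}{N} = c \, r
\] The proportion is unchanged, whereas the per-10k rate moves by the factor \(c\) even though nothing specific to alcohol-related driving has changed.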
The plot below shows the proportion of alcohol-related fatalities in each state over the years. In most states, the proportion was either constant or decreasing over those 7 years. The decrease is most visible in states such as Kansas (KS), North Dakota (ND), and Arkansas (AR). There is, however, one exception to this trend: in the line plot below, Mississippi shows a significant increase in the proportion of alcohol-related fatalities from 1983 to 1988.
In this section, we examine the pairwise relationships between the predictor and response variables. We first examine the correlation between each pair of continuous predictor variables. In the scatterplot matrix below, no distinct patterns are visible between the predictor variables in any year, which suggests low correlation between all pairs of continuous predictors.
Before fitting any models, we also want to check for multicollinearity, which we assess with the variance inflation factor (VIF). The VIF of the \(k\)th predictor, denoted \(VIF_k\), is defined as \[ VIF_k = \frac{1}{1-R_k^2} \] where \(R_k^2\) is the coefficient of multiple determination obtained when the predictor \(X_k\) is regressed on the remaining \(X\) variables. Intuitively, a large \(VIF_k\) means that \(X_k\) is well explained by the other \(X\) variables, which is the signature of multicollinearity. Since \(R_k^2 \geq 0\), we have \(VIF_k \geq 1\), so we would like every \(VIF_k\) to be as small as possible and as close to 1 as possible.
In the table below, we see that all the \(VIF_k\) values are close to 1, signifying that multicollinearity is not an issue in this data set.
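The VIF definition can be computed directly from auxiliary regressions. The sketch below uses simulated predictors (not the Fatalities data), and the names `x1`, `x2`, `x3`, and `vif_by_hand` are placeholders chosen for this example.

```r
# Simulated predictors, used only to show the calculation
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n)   # constructed to be correlated with x1
x3 <- rnorm(n)              # unrelated to the others by construction
X  <- data.frame(x1, x2, x3)

# VIF_k = 1 / (1 - R_k^2), where R_k^2 comes from regressing X_k
# on all remaining predictors
vif_by_hand <- sapply(names(X), function(k) {
  others <- setdiff(names(X), k)
  aux    <- lm(reformulate(others, response = k), data = X)
  1 / (1 - summary(aux)$r.squared)
})

print(round(vif_by_hand, 3))
```

For numeric predictors this is the same quantity that `vif()` in the `car` package (listed in the session information) reports for a fitted linear model.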
With the predictor variables analyzed, we now proceed to the pairwise interactions between those predictor variables and the proportion of alcohol-related fatalities. In the series of boxplots below, we gain some insights that one would generally expect:
On the other hand, an interesting finding from these boxplots is that mandatory jail sentences correlate with a larger proportion of alcohol-related fatalities. This unexpected finding could also be the reason that the number of states implementing such a policy decreased between 1986 and 1988, as shown in the previous section.
After observing how the proportion of alcohol-related fatalities changes with the categorical variables and with time, we do the same for the continuous predictor variables. In the interactive scatter plots below, each data point is colored by whether a mandatory jail sentence was in place, because of the unexpected finding in the previous sections. While no concrete conclusions can be drawn in that regard, the data points in all four plots converge to the bottom left corner as the years go by. This tells us that:
Based on what we have seen so far, these policies do seem to have had a positive effect in the long run. A decreasing beer tax accompanied by a decreasing proportion of alcohol-related fatalities seems counterintuitive, but it could reflect a time lag: people may take time to adjust to high beer taxes before lowering their alcohol purchases and intake, which ultimately results in fewer alcohol-related fatalities.
Lastly, we turn to analyzing the distribution of young drivers (aged between 15-24) throughout the years. While this may not be directly correlated with the current data analysis, it would be interesting to see if the change in the minimum drinking age has any effect on the distribution of young drivers. As expected, with the increase in the minimum drinking age, the proportion of young drivers decreased. This could be due to the decrease in proportion of legal young drivers, which in turn correlates with the decreasing proportion of alcohol related fatalities (shown in scatterplots on the right).
With all of this in mind, we can move on to fitting an appropriate model and drawing causal inferences from these data.
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = afatal_prop ~ spirits + unemp + beertax + drinkage +
## dry + breath + jail + service, data = data, model = "within",
## index = c("state", "year"))
##
## Unbalanced Panel: n = 48, T = 6-7, N = 335
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -3.5395e-01 -2.3925e-02 -5.1855e-05 1.7001e-02 2.1630e-01
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## spirits 0.0856920 0.0248301 3.4511 0.0006453 ***
## unemp 0.0035755 0.0021975 1.6271 0.1048604
## beertax 0.0098449 0.0531128 0.1854 0.8530833
## drinkage19 0.0250474 0.0205063 1.2214 0.2229548
## drinkage20 0.0172362 0.0224746 0.7669 0.4437832
## drinkage21 0.0114565 0.0215649 0.5313 0.5956668
## dry -0.0013173 0.0041761 -0.3154 0.7526623
## breathyes -0.0154861 0.0156792 -0.9877 0.3241671
## jailyes 0.1367219 0.0385335 3.5481 0.0004555 ***
## serviceyes -0.1369902 0.0445419 -3.0755 0.0023113 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 0.85673
## Residual Sum of Squares: 0.70232
## R-Squared: 0.18023
## Adj. R-Squared: 0.011544
## F-statistic: 6.09009 on 10 and 277 DF, p-value: 2.2104e-08
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## spirits 0.1023760 0.0227069 4.5086 9.605e-06 ***
## drinkage19 0.0232031 0.0204567 1.1343 0.2576570
## drinkage20 0.0159299 0.0223372 0.7132 0.4763429
## drinkage21 0.0053786 0.0212524 0.2531 0.8003893
## jailyes 0.1406426 0.0384631 3.6566 0.0003053 ***
## serviceyes -0.1457703 0.0441046 -3.3051 0.0010731 **
## breathyes -0.0160508 0.0155954 -1.0292 0.3042731
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = afatal_prop ~ spirits + drinkage + jail + service +
## breath, data = data, model = "within", index = c("state",
## "year"))
##
## Unbalanced Panel: n = 48, T = 6-7, N = 335
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -0.3498903 -0.0244461 0.0016817 0.0176869 0.2233693
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## spirits 0.1023760 0.0227069 4.5086 9.605e-06 ***
## drinkage19 0.0232031 0.0204567 1.1343 0.2576570
## drinkage20 0.0159299 0.0223372 0.7132 0.4763429
## drinkage21 0.0053786 0.0212524 0.2531 0.8003893
## jailyes 0.1406426 0.0384631 3.6566 0.0003053 ***
## serviceyes -0.1457703 0.0441046 -3.3051 0.0010731 **
## breathyes -0.0160508 0.0155954 -1.0292 0.3042731
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 0.85673
## Residual Sum of Squares: 0.70988
## R-Squared: 0.1714
## Adj. R-Squared: 0.011602
## F-statistic: 8.27436 on 7 and 280 DF, p-value: 3.5305e-09
## al az ar ca co ct de
## 0.1876225 0.0555890 0.2743353 0.0586928 0.2529027 0.1019978 0.0607193
## fl ga id il in ia ks
## 0.1964654 0.0996590 0.1662379 0.1035450 0.1764117 0.2687449 0.3228548
## ky la me md ma mi mn
## 0.1837777 0.2166731 -0.0202843 0.0315205 0.0665882 0.1570624 0.0899452
## ms mo mt ne nv nh nj
## 0.3849691 0.1970770 0.0405722 0.1417193 -0.1655356 -0.0959794 -0.0034107
## nm ny nc nd oh ok or
## 0.1216148 0.0188522 0.1442105 0.2825164 0.2457687 0.1330775 0.1143713
## pa ri sc sd tn tx ut
## 0.1927460 0.1192324 0.1717035 0.1347087 0.0460884 0.2534396 0.1270383
## vt va wa wv wi wy
## 0.1017130 0.1246159 0.0053047 0.0974944 0.1433641 -0.0121222
##
## F test for individual effects
##
## data: afatal_prop ~ spirits + drinkage + jail + service + breath
## F = 10.299, df1 = 44, df2 = 280, p-value < 2.2e-16
## alternative hypothesis: significant effects
##
## Hausman Test
##
## data: afatal_prop ~ spirits + drinkage + jail + service + breath
## chisq = 134.05, df = 7, p-value < 2.2e-16
## alternative hypothesis: one model is inconsistent
##
## F test for individual effects
##
## data: afatal_prop ~ spirits + drinkage + jail + service + breath + ...
## F = 2.6916, df1 = 6, df2 = 274, p-value = 0.01484
## alternative hypothesis: significant effects
##
## Lagrange Multiplier Test - time effects (Breusch-Pagan) for unbalanced
## panels
##
## data: afatal_prop ~ spirits + drinkage + jail + service + breath
## chisq = 10.272, df = 1, p-value = 0.001351
## alternative hypothesis: significant effects
##
## Breusch-Pagan LM test for cross-sectional dependence in panels
##
## data: afatal_prop ~ spirits + drinkage + jail + service + breath
## chisq = 1400, df = 1128, p-value = 4.743e-08
## alternative hypothesis: cross-sectional dependence
##
## Pesaran CD test for cross-sectional dependence in panels
##
## data: afatal_prop ~ spirits + drinkage + jail + service + breath
## z = 2.8788, p-value = 0.003992
## alternative hypothesis: cross-sectional dependence
##
## Breusch-Godfrey/Wooldridge test for serial correlation in panel models
##
## data: afatal_prop ~ spirits + drinkage + jail + service + breath
## chisq = 45.648, df = 6, p-value = 3.478e-08
## alternative hypothesis: serial correlation in idiosyncratic errors
##
## Augmented Dickey-Fuller Test
##
## data: Panel.set$afatal_prop
## Dickey-Fuller = -7.1009, Lag order = 2, p-value = 0.01
## alternative hypothesis: stationary
## spirits drinkage19 drinkage20 drinkage21 breathyes jailyes serviceyes
## 1.473953 3.867132 3.205066 5.955352 1.068955 4.334770 4.384311
\(Y_{it}=\alpha_i+\beta_1 X_{it,1}+\beta_2 X_{it,2}+\beta_3 X_{it,3}+\beta_4 X_{it,4}+\beta_5 X_{it,5}+\beta_6 X_{it,6}+\beta_7 X_{it,7}+\epsilon_{it}\), where \(i=1,2,\dots,48\), \(t=1,2,\dots,7.\)
Parameter notation:
\(i\): state index; \(t\): time index.
\(Y_{it}\): ratio of number of alcohol-involved vehicle fatalities to overall vehicle fatalities for state \(i\) and year \(t\).
\(\alpha_i\): State-specific parameter (unobserved time-invariant individual effect) for state \(i\).
\(\beta_1,\beta_2,\beta_3,\beta_4,\beta_5,\beta_6,\beta_7\): coefficients for spirits, drink-age 19, drink-age 20, drink-age 21, jail, service, and breath, respectively.
Note: because we dropped one observation with missing values (CA 1988), the panel is unbalanced.
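A minimal sketch of how a model of this form can be fitted with `plm`, mirroring the call shown in the output above, is given below. It assumes the AER and plm packages are installed. The report does not state how the `drinkage` factor was constructed, so the whole-year binning used here is an assumption, and the estimates it produces need not match the reported ones.

```r
# Sketch only: assumes the AER and plm packages are installed.
if (requireNamespace("AER", quietly = TRUE) &&
    requireNamespace("plm", quietly = TRUE)) {
  data("Fatalities", package = "AER")

  # Drop the state-year with missing jail/service values (unbalanced panel)
  fat <- Fatalities[complete.cases(Fatalities[, c("jail", "service")]), ]
  fat$afatal_prop <- fat$afatal / fat$fatal

  # ASSUMPTION: drinking age binned to whole years (18, 19, 20, 21);
  # the source does not say how its drinkage factor was built.
  fat$drinkage <- factor(floor(fat$drinkage))

  # State fixed effects ("within") model, one intercept alpha_i per state
  fe_fit <- plm::plm(
    afatal_prop ~ spirits + drinkage + jail + service + breath,
    data = fat, index = c("state", "year"), model = "within"
  )

  print(summary(fe_fit))
  print(plm::fixef(fe_fit))   # estimated state-specific intercepts
}
```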
Assumptions:
Conditional relationship of \(Y_{it}\) given {\(X_{it,1},\dots, X_{it,7}\)} is linear in the explanatory variables.
\(\epsilon_{it}\) are independent random variables with zero mean and constant variance: \(E(\epsilon_{it})=0, Var(\epsilon_{it})=\sigma^2\).
Model congruence examination:
Test for heterogeneity: To check whether the intercepts are equal across states, an F test for individual effects was conducted. \(H_0:\alpha_1=\dots=\alpha_{48}\) vs. \(H_a:\) not all \(\alpha_i\) are equal.
The p-value (\(< 2.2\times 10^{-16}\)) leads us to reject the null hypothesis.
Test for fixed effects versus random effects: To decide between fixed and random effects, we used the Hausman test with hypotheses \(H_0:\) the random effects model is appropriate vs. \(H_a:\) the fixed effects model is appropriate.
The p-value (\(< 2.2\times 10^{-16}\)) indicates that the fixed effects model is more appropriate.
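Test for time-fixed effects: \(H_0:\) no time-fixed effects are needed vs. \(H_a:\) time-fixed effects are needed. Both the F test for individual effects (p-value = 0.015) and the Lagrange Multiplier test (p-value = 0.001) reject \(H_0\) at the 0.05 level, which indicates that time-fixed effects are present. The final model reported here includes state effects only, which is a limitation to keep in mind when interpreting the results.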
Test for cross-sectional dependence: Although we have a micro panel with few years and a large number of states, we still tested whether the residuals are correlated across entities.
Unfortunately, both the Breusch-Pagan LM test and the Pesaran CD test indicate cross-sectional dependence. Potential reasons for and consequences of this dependence are discussed in the Discussion section.
Test for unit roots: Since the Augmented Dickey-Fuller test (reported p-value = 0.01) rejects the null hypothesis of a unit root, no further transformation of the variables is needed.
Multicollinearity: The VIF scores are relatively small (the largest, for drinkage21, is about 5.96), so we concluded that multicollinearity is not a serious concern.
Test for homoscedasticity: Based on the residual plots, the overall spread of the residuals increases as the fitted values increase, so we suspect that the homoscedasticity assumption might not hold.
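The panel diagnostics listed above can be sketched with functions from `plm`, as below. This assumes the AER and plm packages are installed and again uses an assumed whole-year binning of `drinkage`, so the statistics it prints need not match the reported output. The object names (`pool`, `fe`, `re`, `fe_t`, `diag_tests`) are placeholders chosen here.

```r
# Sketch only: assumes AER and plm are installed; drinkage binning is an assumption.
if (requireNamespace("AER", quietly = TRUE) &&
    requireNamespace("plm", quietly = TRUE)) {
  data("Fatalities", package = "AER")
  fat <- Fatalities[complete.cases(Fatalities[, c("jail", "service")]), ]
  fat$afatal_prop <- fat$afatal / fat$fatal
  fat$drinkage <- factor(floor(fat$drinkage))  # ASSUMPTION: whole-year bins

  f   <- afatal_prop ~ spirits + drinkage + jail + service + breath
  idx <- c("state", "year")

  pool <- plm::plm(f, data = fat, index = idx, model = "pooling")
  fe   <- plm::plm(f, data = fat, index = idx, model = "within")
  re   <- plm::plm(f, data = fat, index = idx, model = "random")
  # Same within model with year dummies added
  fe_t <- plm::plm(update(f, . ~ . + factor(year)), data = fat,
                   index = idx, model = "within")

  diag_tests <- list(
    state_effects = plm::pFtest(fe, pool),   # H0: no state effects
    hausman       = plm::phtest(fe, re),     # H0: random effects is consistent
    time_F        = plm::pFtest(fe_t, fe),   # H0: no year effects
    time_LM       = plm::plmtest(pool, effect = "time", type = "bp"),
    csd_LM        = plm::pcdtest(fe, test = "lm"),  # H0: no cross-sectional dependence
    csd_CD        = plm::pcdtest(fe, test = "cd"),
    serial        = plm::pbgtest(fe)         # H0: no serial correlation
  )

  # Collect the p-values in one named vector
  p_values <- sapply(diag_tests, function(tst) unname(tst$p.value))
  print(signif(p_values, 3))
}
```

In each of these tests a small p-value is evidence against the null hypothesis stated in the comment.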
Based on the final model, three factors were significant in explaining the change in the alcohol-related fatalities ratio: spirits consumption, mandatory jail sentence, and mandatory community service for driving under the influence, after controlling for the preliminary breath test law and minimum drinking age. The assumptions needed for causal inference, problems with the current model, and possible solutions are discussed here.
In panel data, measurements are taken on the same entity (state) repeatedly at different time points (years). The fixed effects model adopted in the current project accounts for unobserved, entity-specific, time-invariant confounders. Given these features, the model controls for the effect on the alcohol fatalities ratio of some unmeasured or unmeasurable factors that differ across states. Therefore, if no model assumptions are violated, we should be able to make causal inferences about the significant factors within each state. Our model essentially explains how the change in the response variable over time within a state (\(Y_{it}-\bar{Y}_i\)) can be explained by the change in the predictors of that state (\(X_{it}-\bar{X}_i\)).
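The within-state interpretation can be checked numerically: regressing the demeaned response on the demeaned predictor gives the same slope as a regression with one dummy variable per state. The sketch below uses a simulated unbalanced panel (not the Fatalities data), and every name in it is a placeholder for this example.

```r
# Simulated unbalanced panel, used only to illustrate the within transformation
set.seed(7)
panel <- expand.grid(year = 1:7, state = paste0("s", 1:6))
panel <- panel[-5, ]                                  # drop one row: unbalanced
alpha <- rnorm(6)[as.integer(factor(panel$state))]    # state-specific effects
panel$x <- rnorm(nrow(panel)) + alpha                 # predictor correlated with alpha
panel$y <- 2 * panel$x + alpha + rnorm(nrow(panel), sd = 0.5)

# Within transformation: subtract each state's own mean
panel$y_dm <- panel$y - ave(panel$y, panel$state)
panel$x_dm <- panel$x - ave(panel$x, panel$state)

# Slope from the demeaned regression (no intercept needed after demeaning)
b_within <- coef(lm(y_dm ~ x_dm - 1, data = panel))[["x_dm"]]
# Slope from the dummy-variable regression with one intercept per state
b_lsdv   <- coef(lm(y ~ x + factor(state), data = panel))[["x"]]

print(c(within = b_within, lsdv = b_lsdv))
```

The two slopes agree up to numerical precision, so any time-invariant state characteristic is absorbed by the demeaning step.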
However, the model diagnostics showed that our model has cross-sectional dependence, which violates the model assumption that states are independent of each other. Although each state has its own legislative system, federal law can still strongly influence state laws by withholding or providing funding to encourage the passage of a law. For example, in 1984, federal legislation prompted all 50 states to adopt a minimum legal drinking age of 21 by 1988 [9]. Thus, within the same broad policy environment, it is unlikely that the states' legislation was uncorrelated. Because the model adopted in the current project assumes independent entities, violation of this assumption might bias our results and invalidate causal inference.
Besides the original model assumptions, the fixed effects model also requires strict exogeneity in order to support causal inference, including: (a) no unobserved time-varying confounders; (b) past outcomes do not directly affect the current outcome; (c) past treatments do not directly affect the current outcome; (d) past outcomes do not directly affect the current treatment (reverse causation) [10]. Assumption (a) is hard to verify and also difficult to relax under the fixed effects model. We therefore assumed that no time-varying covariates were omitted from the current model, and examined whether the other assumptions were violated and how they could be relaxed.
Assumption (b) can be relaxed without interfering with the causal inference between current treatment and current outcome, so long as we condition on past treatment and assume that past outcome does not directly affect current treatment. For assumption (c), it is highly likely that our model does not conform to this assumption. It is natural for laws and regulations to have a lagged effect: a law passed this year might not have an effect until the next. To relax assumption (c), we could add a small number of lagged treatment terms to the model (e.g., treatment from the year before) for the "breath", "jail", and "service" predictors. Last, for assumption (d), no reverse causation, a popular approach to relaxing it is to include instrumental variables for endogenous predictors. Endogenous predictors are those that are included in the model but are correlated with the error term. This can happen when the response variable causes the predictor in reverse, or when omitted confounders affect both the dependent and independent variables. Instrumental variables are variables that are not included in the model, are associated with the endogenous predictor, and are not associated with the unobserved confounders.
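As a rough sketch of the lagged-treatment idea, a one-year lag of a law indicator can be built within each state as follows. The toy data frame is hypothetical, the helper `lag1` and the column `jail_lag1` are names invented here, and the approach assumes consecutive years with no gaps inside a state.

```r
# Hypothetical state-year law indicator (1 = law in force), for illustration only
toy <- data.frame(
  state = rep(c("A", "B"), each = 4),
  year  = rep(1982:1985, times = 2),
  jail  = c(0, 0, 1, 1,   0, 1, 1, 1)
)
toy <- toy[order(toy$state, toy$year), ]   # lagging requires time order within state

# Shift a vector down by one position; the first entry has no previous value
lag1 <- function(v) c(NA, head(v, -1))

# Apply the shift separately within each state so values never cross states
toy$jail_lag1 <- ave(toy$jail, toy$state, FUN = lag1)

print(toy)
```

The lagged column could then enter the model formula alongside the current-year indicator. The first observed year of every state has no lag and would drop out of such a fit, which shortens an already short panel.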
The factor of concern for violation of assumption (d) is spirits consumption. Some previous studies on the traffic policy environment and fatality rates suggested using alcohol regulations as instrumental variables for alcohol consumption when investigating the effect of alcohol consumption on traffic accident fatalities. Such alcohol regulations can only affect traffic accident fatalities through alcohol consumption, and previous studies have shown a significant effect of such regulations on alcohol consumption. In the current dataset, the covariate related to alcohol consumption is "spirits", and the alcohol regulations include "drinkage" (minimum drinking age) and "beertax". To verify the appropriateness of drinkage and beertax as instrumental variables for spirits consumption, under-identification, weak instruments, and over-identification need to be tested. The under-identification test has as its null hypothesis that the candidate instrument (beertax or drinkage) is irrelevant to spirits. This can be done through a simple t-test or a likelihood ratio test. The results showed that beertax was not associated with spirits consumption (Pr(>F) = 0.1012), whereas drinkage had a significant effect (Pr(>F) < 0.0001). Thus, beertax failed the under-identification test. Weak instruments were tested by calculating the Cragg-Donald F statistic and comparing it against the Stock and Yogo critical values. The null hypothesis (the instrumental variables are weak) can be rejected if the Cragg-Donald F statistic is greater than the critical value. The Cragg-Donald F statistic calculated for drinkage was 10.59 and the critical value was 22.3, so we failed to reject the null at significance level 0.05. As a result, we could not find appropriate instrumental variables for spirits in the current dataset. If more measures were available, such as other alcohol regulations and other alcohol consumption information, we might be able to find more suitable instrumental variables.
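The relevance and weak-instrument checks both rest on a first-stage regression of the endogenous predictor on the candidate instrument. The sketch below shows that step on simulated data (not the Fatalities data). The variables `z`, `w`, and `x` are hypothetical stand-ins for an instrument, an exogenous control, and the endogenous regressor. With a single endogenous regressor, the Cragg-Donald statistic is commonly noted to coincide with this first-stage F statistic on the excluded instruments. The simulation does not reproduce the 10.59 reported above.

```r
# Simulated data, used only to illustrate the first-stage F statistic
set.seed(1)
n <- 200
z <- rnorm(n)                        # hypothetical instrument
w <- rnorm(n)                        # hypothetical exogenous control
x <- 0.3 * z + 0.5 * w + rnorm(n)    # hypothetical endogenous regressor

# First stage with and without the instrument
first_stage <- lm(x ~ z + w)
restricted  <- lm(x ~ w)

# F test of H0: the instrument has no explanatory power for x
f_table       <- anova(restricted, first_stage)
first_stage_F <- f_table$F[2]
first_stage_p <- f_table[["Pr(>F)"]][2]

print(c(F = first_stage_F, p_value = first_stage_p))
```

With several instruments (for example the dummy variables of a factor), the same comparison applies with all instruments excluded from the restricted model at once. The resulting F is then compared against the relevant tabulated critical value rather than judged by its p-value alone.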
[1] National Center for Health Statistics (US). (2011). Health, United States, 2010: With special feature on death and dying.
[2] Facts, T. S. (2012). Alcohol-Impaired Driving. DOT HS, 811, 630.
[3] Desapriya, E. B., Iwase, N., & Taye, B. N. (2002). Alcohol related traffic safety legislation: Where do we stand today? IATSS Research, 26(2), 76-84.
[4] National Highway Traffic Safety Administration. Fatality Analysis Reporting System (FARS). Washington, DC: cited 2013 March; Available from: http://www.nhtsa.gov/FARS
[5] Hingson, R. W., Howland, J., & Levenson, S. (1988). Effects of legislative reform to reduce drunken driving and alcohol-related traffic fatalities. Public Health Reports, 103(6), 659.
[6] Wagenaar, A. C., Maldonado-Molina, M. M., Ma, L., Tobler, A. L., & Komro, K. A. (2007). Effects of legal BAC limits on fatal crash involvement: analyses of 28 states from 1976 through 2002. Journal of Safety Research, 38(5), 493-499.
[7] Voas, R. B., Tippetts, A. S., & Fell, J. C. (2003). Assessing the effectiveness of minimum legal drinking age and zero tolerance laws in the United States. Accident Analysis & Prevention, 35(4), 579-587.
[8] Wagenaar, A. C., Maldonado-Molina, M. M., Erickson, D. J., Ma, L., Tobler, A. L., & Komro, K. A. (2007). General deterrence effects of US statutory DUI fine and jail penalties: long-term follow-up in 32 states. Accident Analysis & Prevention, 39(5), 982-994.
[9] Silver, D., Macinko, J., Bae, J. Y., Jimenez, G., & Paul, M. (2013). Variation in US traffic safety policy environments and motor vehicle fatalities 1980–2010. Public Health, 127(12), 1117-1125.
[10] Imai, K., & Kim, I. S. (2019). When should we use unit fixed effects regression models for causal inference with longitudinal data? American Journal of Political Science, 63(2), 467-490.
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.4
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] faraway_1.0.7 tseries_0.10-48 MASS_7.3-53 plm_2.4-0
## [5] gplots_3.1.1 panelr_0.7.5 lme4_1.1-26 Matrix_1.2-18
## [9] GGally_2.1.0 forcats_0.5.0 stringr_1.4.0 dplyr_1.0.2
## [13] purrr_0.3.4 readr_1.4.0 tidyr_1.1.2 tibble_3.0.4
## [17] tidyverse_1.3.0 plotly_4.9.3 ggplot2_3.3.3 AER_1.2-9
## [21] survival_3.2-7 sandwich_3.0-0 lmtest_0.9-38 zoo_1.8-8
## [25] car_3.0-10 carData_3.0-4
##
## loaded via a namespace (and not attached):
## [1] minqa_1.2.4 colorspace_2.0-0 ggsignif_0.6.0 ellipsis_0.3.1
## [5] rio_0.5.16 fs_1.5.0 rstudioapi_0.13 ggpubr_0.4.0
## [9] farver_2.0.3 fansi_0.4.1 lubridate_1.7.9.2 xml2_1.3.2
## [13] splines_4.0.3 knitr_1.30 Formula_1.2-4 jsonlite_1.7.2
## [17] nloptr_1.2.2.2 broom_0.7.3 dbplyr_2.0.0 compiler_4.0.3
## [21] httr_1.4.2 backports_1.2.1 assertthat_0.2.1 lazyeval_0.2.2
## [25] cli_2.2.0 htmltools_0.5.0 tools_4.0.3 gtable_0.3.0
## [29] glue_1.4.2 Rcpp_1.0.5 cellranger_1.1.0 vctrs_0.3.6
## [33] nlme_3.1-149 crosstalk_1.1.1 gbRd_0.4-11 xfun_0.19
## [37] rbibutils_2.0 openxlsx_4.2.3 rvest_0.3.6 lifecycle_0.2.0
## [41] gtools_3.8.2 statmod_1.4.35 rstatix_0.6.0 scales_1.1.1
## [45] miscTools_0.6-26 hms_0.5.3 RColorBrewer_1.1-2 quantmod_0.4.18
## [49] yaml_2.2.1 curl_4.3 gridExtra_2.3 pander_0.6.3
## [53] bdsmatrix_1.3-4 reshape_0.8.8 stringi_1.5.3 TTR_0.24.2
## [57] caTools_1.18.1 boot_1.3-25 zip_2.1.1 Rdpack_2.1
## [61] rlang_0.4.10 pkgconfig_2.0.3 bitops_1.0-6 evaluate_0.14
## [65] lattice_0.20-41 htmlwidgets_1.5.3 labeling_0.4.2 cowplot_1.1.1
## [69] tidyselect_1.1.0 plyr_1.8.6 magrittr_2.0.1 R6_2.5.0
## [73] generics_0.1.0 DBI_1.1.0 pillar_1.4.7 haven_2.3.1
## [77] foreign_0.8-80 withr_2.3.0 xts_0.12.1 jtools_2.1.2
## [81] abind_1.4-5 modelr_0.1.8 crayon_1.3.4 KernSmooth_2.23-17
## [85] rmarkdown_2.6 maxLik_1.4-6 readxl_1.3.1 data.table_1.13.6
## [89] reprex_0.3.0 digest_0.6.27 munsell_0.5.0 viridisLite_0.3.0
## [93] quadprog_1.5-8